genomeA

Journals.ASM.org 

Draft Genome Sequence of the Mycobacterium tuberculosis Clinical 
Isolate C2, Belonging to the Latin American-Mediterranean Family 

Yu-Chieh Liao, b Yih-Yuan Chen, a Hsin-Hung Lin, b Jia-Ru Chang, a Ih-Jen Su, a Tsi-Shu Huang, c Horng-Yunn Dou a

National Institute of Infectious Diseases and Vaccinology, National Health Research Institutes, Zhunan, Miaoli, Taiwan a ; Division of Biostatistics and Bioinformatics, Institute
of Population Health, National Health Research Institutes, Zhunan, Miaoli, Taiwan b ; Department of Microbiology, Kaohsiung Veterans General Hospital, Kaohsiung,
Taiwan c

Tuberculosis remains a major infectious disease in Taiwan. Here we present the draft genome sequence of the Mycobacterium 
tuberculosis C2 strain, belonging to the Latin American-Mediterranean lineage. The draft genome sequence comprises 
4,453,307 bp with a G+C content of 65.6%, revealing 4,390 coding genes and 45 tRNA genes. 
Tuberculosis is a leading notifiable communicable disease in
Taiwan. Based on molecular methods, six distinct clades, Beijing,
Haarlem, East-African Indian (EAI), Latin American-Mediterranean
(LAM), U, and T lineages, were isolated. Epidemiological
surveillance indicated that the Beijing lineage was the most
prevalent strain in Taiwan, followed by EAI and Haarlem strains
(1, 2). The percentages of isolated LAM strains were around 1% to
7% (2, 3).

Here we report the draft genome sequence of a clinical
isolate of Mycobacterium tuberculosis strain C2, isolated at Kaohsiung
Veterans General Hospital, Taiwan, from human sputum.
Sputum microscopy and culture samples from the patient tested 
positive for tuberculosis. The genomic DNA of isolate C2 was 
extracted from cultured cells as described previously (4), and was 
sequenced using a MiSeq platform (Illumina, San Diego, CA, USA). 
Strain C2 was also analyzed by spoligotyping (337777607760771) and
mycobacterial interspersed repetitive unit-variable number tandem 
repeat (MIRU-VNTR) typing (2241243261532241144132250) and 
was identified as a member of the LAM family. The LAM lineage
is one of six major clades of M. tuberculosis strains identified in
Taiwan. LAM strains, originating in Europe and America, may have first
arrived during the Portuguese reign in the 16th century and
been passed on to the natives of Taiwan.
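
The spoligotype reported above is an octal code, and expanding it into its 43 spacer flags makes the LAM assignment easier to check by eye. The following is a minimal sketch, not the typing pipeline used in this study. It assumes the code is read as the 15-digit string 337777607760771 and follows the usual octal convention (14 digits covering three spacers each, plus a final 0/1 digit for spacer 43). The function names are hypothetical.

```python
def octal_to_spacers(octal_code):
    """Expand a 15-digit spoligotype octal code into 43 presence flags.

    Assumed convention: digits 1-14 each encode three spacers (most
    significant bit first); digit 15 is 0 or 1 and encodes spacer 43.
    """
    if len(octal_code) != 15 or not octal_code.isdigit():
        raise ValueError("expected a 15-digit octal code")
    if any(d > "7" for d in octal_code[:14]) or octal_code[14] not in "01":
        raise ValueError("digits 1-14 must be 0-7 and digit 15 must be 0 or 1")
    bits = "".join(format(int(d), "03b") for d in octal_code[:14])
    bits += octal_code[14]
    return [b == "1" for b in bits]


def absent_spacers(octal_code):
    """Return the 1-based positions of spacers that are absent."""
    flags = octal_to_spacers(octal_code)
    return [i + 1 for i, present in enumerate(flags) if not present]


if __name__ == "__main__":
    # Code as read from the text above (an assumption about the digit string).
    print(absent_spacers("337777607760771"))
```

Under these assumptions the absent spacers are 1, 4, 21 to 24, and 33 to 36. The simultaneous absence of spacers 21 to 24 and 33 to 36 is the pattern commonly used to recognize LAM strains, which agrees with the family assignment stated above.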

This study was approved by the Human Ethics Committee
of the National Health Research Institutes, Taiwan (code
EC1010804-E). Because of the retrospective nature of the study, the
routine collection of clinical data in daily practice, and the delinking
of personal information, the requirement to obtain informed consent was
waived by our institutional review board.

A total of 8,259,434 paired-end reads of 251 bp in length,
with an average insert size of 289 bp, were produced. The sequencing
reads were trimmed with a quality score limit of 0.05, permitting at
most two ambiguous nucleotides, and reads shorter than 50 bp after
trimming were discarded. Five different
genome assemblies, generated separately by using ABySS v1.3.4
(5), Edena v3.130110 (6), SPAdes v2.5.0 (7), and Velvet v1.2.09
(8), were subsequently integrated into one assembly by using
CISA (9). The assembly resulted in 85 contigs (>200 bp) with an N50
value of 168,084 bp and a total assembly length of 4,453,307 bp, with a
G+C content of 65.6%. The draft genome was annotated using
the RAST server (10), resulting in a total of 4,390 coding genes and
45 tRNA genes.
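
The assembly statistics quoted above (N50 and G+C content) can be recomputed from any set of contigs. The sketch below illustrates the standard definitions and is not the tooling used in this study. The function names are hypothetical, the contigs are assumed to be already loaded as plain strings, and ambiguous bases are excluded from the G+C denominator, which is one common convention among several.

```python
def n50(lengths):
    """Return the N50: the length of the contig at which the running total,
    taken from longest to shortest, first reaches half of the assembly size."""
    ordered = sorted(lengths, reverse=True)
    if not ordered:
        raise ValueError("no contigs supplied")
    half = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half:
            return length


def gc_percent(sequences):
    """Return G+C content as a percentage of unambiguous bases (A, C, G, T)."""
    gc = 0
    total = 0
    for seq in sequences:
        seq = seq.upper()
        gc += seq.count("G") + seq.count("C")
        total += sum(seq.count(base) for base in "ACGT")
    if total == 0:
        raise ValueError("no unambiguous bases supplied")
    return 100.0 * gc / total


if __name__ == "__main__":
    # Toy contigs for illustration only, not data from isolate C2.
    toy_contigs = ["GGCCGG", "ATGC", "AT"]
    print(n50([len(c) for c in toy_contigs]))
    print(round(gc_percent(toy_contigs), 1))
```

Applied to the 85 contigs of the deposited assembly, functions of this kind would be expected to reproduce the reported values, provided the same minimum contig length (>200 bp) is used.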

Nucleotide sequence accession numbers. This whole-genome
shotgun project has been deposited at DDBJ/EMBL/GenBank
under accession no. JHAD00000000. The version described in this
paper is the first version, JHAD01000000.